Big Data Europe

Authors

  • Hajira Jabeen
  • Phil Archer
  • Simon Scerri
  • Aad Versteden
  • Ivan Ermilov
  • Giannis Mouchakis
  • Jens Lehmann
  • Sören Auer
Abstract

The BigDataEurope (BDE) project is developing exactly the kind of computing infrastructure that European stakeholders need when handling large volumes of data in a variety of formats; the results are open-source and their use is completely free. Coordinated by Fraunhofer IAIS, BDE is working directly with partners that represent the seven Societal Challenges identified by the European Commission (Health, Food, Energy, Transport, Climate, Social Sciences and Security). For each community, a pilot that makes use of BDE's technology stack to address the Big Data needs identified by these challenges is well under way.

1 THE BIG DATA INTEGRATOR PLATFORM

BDE's Integrator Platform (BDI) makes the processing of big data simpler, cheaper and more flexible than ever before. It offers basic building blocks to get started with common big data technologies and makes integration of different technologies and applications easy. Components such as Apache Spark, Hadoop HDFS, Apache Flink, Apache Flume and Apache Kafka can be built into a pipeline through a simple graphical UI. Those components can help handle the velocity and volume dimensions, but BDI is also leading the way in tackling that third big data problem: variety. This is done through BDI's Semantic Data Lake and components like SANSA [1], which performs analytics on semantically structured RDF data by providing out-of-the-box scalable algorithms for massive datasets.

BDI is an open source platform based on Docker, today's virtualisation technique of choice. It works on a local machine or on hundreds of nodes using Docker Swarm, and can run in-house, or within an external cloud environment (not provided by BDE). BDE applications are provided as Docker containers, making their installation and set-up a 10-minute job.
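To make the container-based set-up more concrete, the following minimal Python sketch assembles a Compose-style (version 2) definition for a small HDFS and Spark pipeline and prints it as JSON. It is an illustration only, not BDE's actual configuration: the function name `build_compose`, the service names and every `example/...` image name are hypothetical placeholders. Because JSON is also valid YAML, the printed definition could in principle be saved to a file and handed to Docker Compose.

```python
import json


def build_compose(worker_count: int = 2) -> dict:
    """Return a Compose-style (version 2) definition for a small pipeline.

    All service and image names are hypothetical placeholders,
    not the images published by the BDE project.
    """
    services = {
        "namenode": {"image": "example/hadoop-namenode"},
        "datanode": {
            "image": "example/hadoop-datanode",
            "depends_on": ["namenode"],
        },
        "spark-master": {"image": "example/spark-master"},
    }
    # Add one service entry per Spark worker; each waits for the master.
    for i in range(1, worker_count + 1):
        services[f"spark-worker-{i}"] = {
            "image": "example/spark-worker",
            "depends_on": ["spark-master"],
        }
    return {"version": "2", "services": services}


if __name__ == "__main__":
    # Emit the definition as JSON, which is a subset of YAML.
    print(json.dumps(build_compose(), indent=2))
```

Generating the definition from code rather than writing it by hand is just one possible design; it makes the number of workers a parameter, which mirrors the idea of moving the same pipeline from a single machine to a larger cluster.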
With the help of the latest Docker features, BDI offers:

  • Swarm-based networking
  • Load Balancing
  • Service Discovery
  • Multi-host networking with integrated KV-Store
  • Fault tolerance

Docker Compose helps to create multiple containers on multiple nodes using a single command and a single compose file. Docker Compose V2 and Docker Swarm aim to implement full integration, which means that it is feasible to point a Compose app at a Swarm cluster and use it in the same manner as if a single Docker host were being used. Notably, the latest Docker components more closely resemble Kubernetes in terms of orchestration features, and Swarm presents a better choice for shifting from a local/development environment to a cluster.

[1] http://sansa-stack.net/

The BDE team provides baseline Docker images for Apache Hadoop, Spark, Flink and many others. Components were selected based on the requirements gathered from the seven Societal Challenges. Thus, the platform makes it feasible to perform a variety of big data tasks, including message passing (Kafka, Flume) and storage (Hive, Cassandra). The platform is able to handle RDF triples at scale using components like FOX, SemaGrow and 4Store, with particular emphasis on the triplification of geospatial data using GeoTriples, Sextant and Strabon.

BDI has enriched the Docker platform, a high-level depiction of which is shown in Figure 1, with a layer of supporting services that help in the setup, maintenance and monitoring of pipelines and workflows:

  • The Init daemon allows workflows to be defined by monitoring the start-up status of inter-dependent Docker components.
  • The Pipeline Service and Builder are developed to support the creation of workflows.
  • The Pipeline Monitor front-end shows the current status of the Docker components.
  • The Integrator UI integrates the different official Web UIs of select pipeline components under one integrated and personalised view.
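The idea of ordering inter-dependent components by their start-up status can be sketched with a topological sort. The example below is not the Init daemon's implementation; it only illustrates the underlying technique using Python's standard `graphlib` module (Python 3.9 or later). The component names and the `startup_batches` helper are hypothetical, and marking a batch as done stands in for each component reporting that it has started.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each component maps to the components it must wait for.
DEPENDENCIES = {
    "hdfs-namenode": set(),
    "hdfs-datanode": {"hdfs-namenode"},
    "spark-master": set(),
    "spark-worker": {"spark-master"},
    "analytics-job": {"hdfs-datanode", "spark-worker"},
}


def startup_batches(dependencies):
    """Group components into batches that can be started in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        # Components whose prerequisites have all been marked as done.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Stand-in for "every component in this batch reported started".
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(startup_batches(DEPENDENCIES), start=1):
        print(f"step {step}: {', '.join(batch)}")
```

With the sample dependencies, the two components without prerequisites form the first batch, the HDFS data node and Spark worker the second, and the analytics job the last.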
Furthermore, the Swarm UI visualises the status of a swarm cluster and allows the cluster services to be scaled and monitored.

Figure 1: BDI platform's high-level modular architecture

For BDI platform progress updates please refer to the dedicated page [2], or try it out or engage with our community [3].

Similar resources

Big data impact on society: a research roadmap for Europe

With its rapid growth and increasing adoption, big data is producing a growing impact in society. Its usage is opening both opportunities such as new business models and economic gains and risks such as privacy violations and discrimination. Europe is in need of a comprehensive strategy to optimise the use of data for a societal benefit and increase the innovation and competitiveness of its pro...

Jose Maria Cavanillas, Edward Curry, and Wolfgang Wahlster (editors): new horizons for a data-driven economy: a roadmap for usage and exploitation of big data in Europe

Read more and get great! That's what the book enPDFd new horizons for a data driven economy a roadmap for usage and exploitation of big data in europe will give for every reader to read this book. This is an on-line book provided in this website. Even this book becomes a choice of someone to read, many in the world also loves it so much. As what we talk, when you read more every page of this ne...

Big-Science facilities in Europe need greater coordination of resources

The leading role in science played by crystallography is heavily dependent on Big-Science facilities. The need for Europe-wide coordination of operational resources in Big Science is discussed with particular reference to neutron sources.

The Big Data Value Chain: Definitions, Concepts, and Theoretical Approaches

The emergence of a new wave of data from sources, such as the Internet of Things, Sensor Networks, Open Data on the Web, data from mobile applications, social network data, together with the natural growth of datasets inside organisations (Manyika et al. 2011), creates a demand for new data management strategies which can cope with these new scales of data environments. Big data is an emerging ...

Societal impacts of big data: challenges and opportunities in Europe

This paper presents the risks and opportunities of big data and the potential social benefits it can bring. The research is based on an analysis of the societal impacts observed in a set of six case studies across different European sectors. These impacts are divided into economic, social and ethical, legal and political impacts, and affect areas such as improved efficiency, innovation and deci...

The stakes of Big Data in the IT industry China as the next global challenger?

The information society relies on services for communicating, sharing, networking, searching, buying, etc. which are mostly provided by large corporations, such as Google, Facebook, or Amazon. The Web connects all regions in the World, but its most popular services are ensured by a handful of corporations which are almost all in the USA. While Europe is relying on the American industry in an es...

Publication date: 2017